Toward a Memory-Centric, Stacked Architecture for Extreme-Scale, Data-Intensive Computing
Authors
Abstract
One of the primary concerns of performing efficient data-intensive computing at scale is the inherent ability to exploit memory bandwidth on a local and global scale. The traditional computer architecture inherently decouples the processing interconnect from the memory interconnect, thus preventing efficient, parallel utilization of both at scale. Further, the orthogonal nature of these board-level and system-level interconnects forces users to build messaging libraries to bridge the gap between local and global memory access. The result is a system architecture that fails to efficiently expose the necessary locality of non-deterministic data patterns for algorithmic exploitation at scale. In this position paper, we present a theoretical architecture based upon processing near memory, using a 3-dimensional stacked device as the central processing and networking component. We utilize the second-generation Hybrid Memory Cube (HMC) device architecture, packet specification, and interconnect as the basis for our architecture in order to dramatically increase the compute-to-bandwidth ratio by augmenting the logic layer of the HMC devices with actual processing elements. We also utilize the same core component as the interconnection mechanism between multiple processing elements, so that memory and communication become a homogeneous function of the platform. Finally, we explore two potential algorithmic exploitation methodologies that could be used to efficiently implement non-deterministic data analytics problems at scale.
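To make the processing-near-memory idea concrete, the following is a minimal C sketch of how a host might encode a near-memory "add" request destined for a processing element on an HMC-style logic layer. The nm_request_t structure, its field names, and the NEAR_MEM_ADD opcode are hypothetical illustrations and do not reproduce the actual HMC 2.0 packet format or command set; the sketch only shows the kind of tag-matched, packetized request interface the abstract describes.

```c
/* Illustrative sketch only: a simplified model of issuing a near-memory
 * "add-immediate" operation to a processing element on an HMC-style
 * logic layer.  Field names, widths, and the opcode are hypothetical
 * approximations, not the actual HMC 2.0 packet layout. */

#include <stdint.h>
#include <stdio.h>

/* Hypothetical request-packet header: a command code, a destination
 * cube, a tag for matching the eventual response, the target address,
 * and an immediate operand carried with the request. */
typedef struct {
    uint8_t  cmd;      /* operation code, e.g. NEAR_MEM_ADD below        */
    uint8_t  cube_id;  /* destination device in the chained interconnect */
    uint16_t tag;      /* request/response matching tag                  */
    uint64_t addr;     /* target memory address inside the cube          */
    uint64_t operand;  /* immediate operand for the near-memory add      */
} nm_request_t;

enum { NEAR_MEM_ADD = 0x12 };  /* hypothetical opcode */

/* Build a request asking the logic-layer processing element of cube
 * `cube` to add `value` to the 64-bit word at `addr`, so the
 * read-modify-write never crosses the processor interconnect. */
static nm_request_t make_near_mem_add(uint8_t cube, uint64_t addr,
                                      uint64_t value, uint16_t tag)
{
    nm_request_t req = {
        .cmd = NEAR_MEM_ADD, .cube_id = cube,
        .tag = tag, .addr = addr, .operand = value,
    };
    return req;
}

int main(void)
{
    nm_request_t req = make_near_mem_add(/*cube=*/2, 0x0000000040001000ULL,
                                         /*value=*/7, /*tag=*/42);
    printf("cmd=0x%02x cube=%u tag=%u addr=0x%llx operand=%llu\n",
           (unsigned)req.cmd, (unsigned)req.cube_id, (unsigned)req.tag,
           (unsigned long long)req.addr,
           (unsigned long long)req.operand);
    return 0;
}
```

The point of the sketch is that a single packet carries the command, the address, and the operand, so the same packetized interface can serve both as memory access and as communication between processing elements, which is the homogeneity the abstract argues for.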
Similar Resources
On Processing Extreme Data
Extreme Data is an incarnation of the Big Data concept, distinguished by the massive amounts of data that must be queried, communicated, and analyzed in near real time using a very large number of memory or storage elements and exascale computing systems. Immediate examples are the scientific data produced at a rate of hundreds of gigabits per second that must be stored, filtered, and analyzed, the...
Optimizing power efficiency for 3D stacked GPU-in-memory architecture
With the prevalence of data-centric computing, the key to achieving energy efficiency is to reduce the latency and energy cost of data movement. Near data processing (NDP) is such a technique: instead of moving data around, it moves computing closer to where the data is stored. The emerging 3D stacked memory brings opportunities for achieving both high power efficiency and less data ...
3D-Stacked Memory-Side Acceleration: Accelerator and System Design
Specialized hardware acceleration is an effective technique for mitigating the dark silicon problem. A challenge in designing on-chip hardware accelerators for data-intensive applications is how to efficiently transfer data between the memory hierarchy and the accelerators. Although the Processing-in-Memory (PIM) technique has the potential to reduce the overhead of data transfers, it is limited b...
A 3D-Stacked Architecture for Secure Memory Acquisition
Many security and forensic analyses rely on the ability to fetch memory snapshots from a target machine. To date, the security community has relied on virtualization, external hardware, or trusted hardware to obtain such snapshots. We show that these prior techniques either sacrifice snapshot consistency or impose a performance penalty on applications executing atop the target. We present a new ...
Gray's laws: database-centric computing in science
The explosion in scientific data has created a major challenge for cutting-edge scientific projects. With datasets growing beyond a few tens of terabytes, scientists have no off-the-shelf solutions that they can readily use to manage and analyze the data [1]. Successful projects to date have deployed various combinations of flat files and databases [2]. However, most of these solutions have bee...